IN THE DRAWINGS:

The attached five (5) Replacement Drawing sheets of formal drawings for Figs. 1-5
are submitted herewith to replace the original drawing sheets filed on September 12,
2003, and the Replacement Drawings filed with the Preliminary Amendment on February
6, 2004. No new matter has been added.



Attachment: Replacement Sheets (Figs. 1-5) 



REMARKS 

The Office Action dated April 27, 2007 has been received and carefully noted. 
The above amendments to the claims, and the following remarks, are submitted as a full 
and complete response thereto. 

Figures 1-5 are amended. The specification is amended to correct informalities. 
Claims 1, 5 and 9 are amended to more particularly point out and distinctly claim the 
subject matter of the present invention. New claims 16-21 are added. Support for the 
amendments is found at least in paragraphs [0013] - [0015], [0022], and [0027] of the
present specification. No new matter is added. Claims 1-21 are respectfully submitted 
for consideration. 

The Office Action objected to Figs. 1-5 as being informal. Applicants respectfully
submit that formal drawing Figs. 1-5 are provided in the attached replacement sheets.
Accordingly, withdrawal of the objection to the drawings is respectfully requested. 

The Office Action objected to the specification because of informalities. 
Applicants respectfully submit that paragraphs [0034] and [0036] are amended to correct 
all known typographical informalities. Accordingly, withdrawal of the objection to the 
specification is respectfully requested. 

The Office Action rejected claims 1-3, 5-7 and 9-11 under 35 U.S.C. 102(e) over
US Patent No. 6,526,395 to Morris (Morris). Applicants submit that Morris fails to
disclose or suggest all of the features recited in any of the pending claims.

Claim 1, from which claims 2-4 depend, is directed to a method of speech
recognition. Audio signals are received from a speech source. Video signals are received
from the speech source. It is determined if the audio signals can be processed. If at
least a portion of the audio signals cannot be processed, the video signals are processed. At
least one of the audio signals and the video signals is converted into recognizable
information. A task is implemented based on the recognizable information.

Claim 5, from which claims 6-8 depend, is directed to a speech recognition device. 
An audio signal receiver is configured to receive audio signals from a speech source. A 
video signal receiver is configured to receive video signals from the speech source. A 
processing unit is configured to detect if the audio signals can be processed and if so, to 
process the audio signals. The video signals are processed if it is detected that at least a 
portion of the audio signals cannot be processed. A conversion unit is configured to 
convert at least one of the audio signals and the video signals to recognizable 
information. An implementation unit is configured to implement a task based on the 
recognizable information. 

Claim 9, from which claims 10-12 depend, is directed to a system for speech 
recognition. A first receiving means is configured for receiving audio signals from a 
speech source. A second receiving means is configured for receiving video signals from 
the speech source. A processing means is configured for detecting if the audio signals 
can be processed and processing the audio signals if the audio signals can be processed. 
The processing means processes the video signals if at least a portion of the audio signals 
cannot be processed. A converting means is configured for converting at least one of the
audio signals and the video signals to recognizable information. An implementing means 
is configured for implementing a task based on the recognizable information. 

Applicants submit that each of the pending claims recites features that are neither 
disclosed nor suggested in Morris. 

Morris is directed to an apparatus that includes a video input unit and an audio input
unit. The apparatus also includes a multi-sensor fusion/recognition unit coupled to the 
video input unit and the audio input unit, and a processor coupled to the multi-sensor 
fusion/recognition unit. The multi-sensor fusion/recognition unit decodes a combined 
video and audio stream containing a set of user inputs. Fig. 2 of Morris illustrates input 
interpretation of the system. According to Morris, the audio input and video input are
processed in parallel. See Fig. 2, refs. 206, 208, 210 and 212. In block 210, visual 
gestures are captured from the speech input unit. See col. 4 lines 25-44. 

Applicants submit that Morris fails to disclose or suggest at least the feature of
determining if the audio signals can be processed and processing the video signals if it is
detected that at least a portion of the audio signals cannot be converted, as recited in
claims 1, 5, 9 and 13-15. As discussed above, Morris is silent with regard to detecting if
the audio signals can be processed or converted. Further, Morris describes parallel
processing of the audio and video signals, regardless of whether the entire portion of
the audio signals can be converted.

Applicants submit that because claims 2-3, 6-7, and 10-11 depend from claims 1,
5, and 9, these claims are allowable at least for the same reasons as claims 1, 5, and 9, as
well as for the additional features recited in these dependent claims.

Based at least on the above, Applicants submit that Morris fails to disclose or 
suggest all of the features of claims 1-3, 5-7 and 9-11. Accordingly, withdrawal of the 
rejection under 35 U.S.C. 102(e) is respectfully requested. 

The Office Action rejected claims 13-15 under 35 U.S.C. 103(a) as being obvious 
over Morris, in view of US Patent Publication No. 2004/0109588 to Houvener 
(Houvener). The Office Action took the position that Morris disclosed all of the features
of these claims except a second processing unit that processes the video signals when a
segment of the audio signals cannot be converted into the recognizable information,
wherein the video signals coincide with the segment of the audio signals that cannot be
converted into the recognizable information. The Office Action asserted that "presumably
. . . Morris would simply do the best it could with whatever information is present,
including video signals". The Office Action further asserted that Houvener disclosed this
feature. Applicants submit that the cited references, taken individually or in combination,
fail to disclose or suggest all of the features recited in any of the pending claims.

Claim 13 is directed to a method of speech recognition. Audio signals are received 
from a speech source. Video signals are received from the speech source. If it is detected 
that the audio signals can be converted into a recognizable format, the audio signals are 
processed and converted into recognizable information. The video signals are processed 
when a segment of the audio signals cannot be converted into the recognizable
information, wherein the video signals coincide with the segment of the audio signals that 
cannot be converted into the recognizable information. The processed video signals are 
converted into the recognizable information. A task based on the recognizable 
information is implemented. 

Claim 14 is directed to a speech recognition device. An audio signal receiver is 
configured to receive audio signals from a speech source. A video signal receiver is 
configured to receive video signals from the speech source. A first processing unit is 
configured to detect if the audio signals can be converted, and if the audio signals can be 
converted, process the audio signals. A first conversion unit is configured to convert the 
audio signals to recognizable information. A second processing unit is configured to 
process the video signals when the audio signals cannot be converted into the 
recognizable information, wherein the video signals coincide with the segment of the 
audio signals that cannot be converted into the recognizable information. A second 
conversion unit is configured to convert the processed video signals into the recognizable 
information. An implementation unit is configured to implement a task based on the 
recognizable information. 

Claim 15 is directed to a system for speech recognition. A first receiving means is
configured for receiving audio signals from a speech source. A second receiving means is 
configured for receiving video signals from the speech source. A first processing means 
is configured for detecting if the audio signals can be converted, and if the audio signals 
can be converted, processing the audio signals. A first converting means is configured for
converting the audio signals into recognizable information. A second processing means is
configured for processing the video signals when a segment of the audio signals cannot
be converted into the recognizable information, wherein the video signals coincide with 
the segment of the audio signals that cannot be converted into the recognizable 
information. A second converting means is configured for converting the processed video 
signals into the recognizable information. An implementing means is configured for 
implementing a task based on the recognizable information. 

Applicants submit that each of the above claims recites features that are neither 
disclosed nor suggested in any of the cited references. 

Morris is discussed above. Houvener is directed to mobile identity verification 
using tiered biometric analysis. Houvener discloses that a biometric analysis unit 
analyzes the primary biometric data and compares it against known biometric data in the 
database. The biometric analysis unit provides match data that is indicative of whether a 
match exists with respect to the primary biometric data and whether the primary 
biometric data is above a minimum primary biometric data correlation threshold. 

Applicants submit that the cited references fail to disclose or suggest at least the 
features of "processing the video signals when a segment of the audio signals can not be 
converted into the recognizable information, wherein the video signals coincide with the 
segment of the audio signals that cannot be converted into the recognizable information, 
and converting the processed video signals into the recognizable information" as recited 
in claim 13 and similarly recited in claims 14 and 15. As discussed above, the Office
Action primarily relied on Houvener to disclose this feature. However, Applicants 
submit that Houvener fails to cure the admitted deficiencies of Morris. 

The presently claimed invention recites the feature of detecting if a segment of
the audio signals cannot be converted into a recognizable format. As stated above, the Office
Action made the statement that Morris would do the best it can with the information it
has. Applicants submit that not only are the cited references (most notably Morris) silent
with regard to the above-mentioned feature, but the position taken in the Office Action also
disregards the feature that a determination is made as to whether one type of signal (audio)
can be processed, and that a second type of signal (video) is processed if the first type
(audio) cannot be processed. Morris makes no such determination, and proceeds to process
both signals in parallel. Applicants further submit that Houvener fails to cure these
deficiencies and is non-analogous to Morris.

As discussed above, Houvener merely discloses checking a biometric input against
stored biometric data. In Houvener, if the primary biometric data fails a comparison test
with stored biometric data, then a secondary biometric data test is performed. As stated
above, claims 13-15 recite, in part, "wherein the video signals coincide with the segment
of the audio signals that cannot be converted into the recognizable information."
(underline added). Houvener merely describes different tiers (levels) of security
verification. There is no determination as to whether a segment of the audio signals can
be converted into a recognizable format, followed by processing of coinciding video signals.

Thus, Applicants submit that Houvener fails to cure the admitted
deficiencies of Morris, and the cited references fail to disclose or suggest all of the
features recited in claims 13-15.

Applicants further submit that one skilled in the art would not be motivated to 
combine the teachings of Morris and Houvener because the references are non-analogous. 
As discussed above, Morris is directed to using one type of signal (for example audio) in 
combination with another type of signal (for example video) from the same source, to 
increase the quality and accuracy of computer interaction (i.e., artificial intelligence).
See col. 1 lines 30-50. On the other hand, as discussed above, Houvener is directed to 
identity verification (i.e., security). Houvener's system merely compares biometric data 
input from a person with stored data to verify a user. The Office Action asserted that it
would have been obvious for one skilled in the art to "employ the thresholding method
for only utilizing secondary biometric signals when the primary biometric signals are
below a threshold and unrecognizable as taught by Houvener." Applicants respectfully
traverse this assertion. 

The thresholding system described in Houvener is a tiered approach and the 
biometric inputs are not processed in parallel, as they are in Morris. Thus, a person skilled in
the art would not utilize Houvener to cure any deficiencies of Morris. 

Based at least on the above, Applicants submit that the cited references fail to 
disclose or suggest all of the features recited in claims 13-15. Accordingly, withdrawal 
of the rejection under 35 U.S.C. 103(a) is respectfully requested.

The Office Action rejected claims 4, 8 and 12 under 35 U.S.C. 103(a) as being
obvious over Morris, in view of US Patent No. 6,219,639 to Bakis et al. (Bakis). The 
Office Action took the position that Morris disclosed all of the features recited in these 
claims except storing audio and video signals to a destination source and a transmitter for 
sending the audio signals and the video signals to a destination source. The Office 
Action asserted that it is well-known in the art to operate biometric identification via a 
client/server network, where biometric data is stored on a server and biometric data is 
collected locally and compared to stored biometric data on the server. The Office Action 
relied on Bakis in support of this assertion. Applicants submit that the cited references, 
taken individually or in combination, fail to disclose or suggest all of the features recited 
in any of the pending claims. Specifically, Morris is deficient at least for the reasons 
discussed above, and Bakis fails to cure these deficiencies. 

Morris is discussed above. Bakis is directed to recognizing an individual based on 
attributes associated with the individual. Bakis describes pre-storing previously extracted 
biometric attributes which may be later retrieved for the purpose of comparison with 
subsequently extracted biometric attributes to see if a match exists between the two. See 
col. 8 lines 47-53. 

However, Applicants submit that Bakis is merely a cumulative reference when 
combined with Morris. The combination discloses comparing biometric data with stored 
biometric data. Thus, Bakis fails to cure the deficiencies of Morris discussed above.

Based at least on the above, Applicants submit that the cited references fail to
disclose or suggest all of the features of claims 4, 8 and 12. Accordingly, withdrawal of
the rejection under 35 U.S.C. 103(a) is respectfully requested. 

As discussed above, new claims 16-21 are added and supported in the present 
specification. Applicants submit that each of claims 16-21 recites features that are neither
disclosed nor suggested in any of the cited references. 

Applicants submit that each of claims 1-21 recites features that are neither 
disclosed nor suggested in any of the cited references. Accordingly, it is respectfully 
requested that each of claims 1-21 be allowed, and this application passed to issue. 

If for any reason the Examiner determines that the application is not now in 
condition for allowance, it is respectfully requested that the Examiner contact, by 
telephone, the applicant's undersigned attorney at the indicated telephone number to 
arrange for an interview to expedite the disposition of this application.

In the event this paper is not being timely filed, the applicant respectfully petitions
for an appropriate extension of time. Any fees for such an extension together with any 
additional fees may be charged to Counsel's Deposit Account 50-2222. 